Project Title: Comparative Performance Analysis of Kolmogorov Arnold Network and Neural Network in Discovering Dynamics of Linear and Nonlinear Systems.
System identification is the process of characterizing a system, in which researchers or engineers build a mathematical model of the system's dynamics from input and output data. Often it is not possible to access the physical system directly, and it is comparatively less complex to build mathematical models that describe the system's responses.
In this project, we will identify different linear and nonlinear systems using the Kolmogorov-Arnold Network and compare the results with Neural Network based system identification models. Due to limited computational resources, we will consider one 3rd-order linear time-invariant system and one 2nd-order nonlinear system.
The Kolmogorov-Arnold Network uses B-splines as its weights, and these splines are parametrized by trainable coefficients.
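Concretely, each KAN edge is a univariate function parametrized by B-spline coefficients. Below is a minimal sketch of one such edge using SciPy — not the pykan implementation, and assuming a clamped knot grid:

```python
import numpy as np
from scipy.interpolate import BSpline

# A single KAN "weight" is a learnable univariate function, parametrized
# as a B-spline: the knot grid is fixed, the coefficients are trained.
k = 3          # spline degree (k=3, as in the models below)
G = 3          # number of grid intervals
grid = np.linspace(-1.0, 1.0, G + 1)                           # interior grid
knots = np.concatenate([[grid[0]] * k, grid, [grid[-1]] * k])  # clamped knots
coeffs = np.random.default_rng(0).normal(size=G + k)           # G + k trainable params

phi = BSpline(knots, coeffs, k)   # phi: R -> R, one edge activation
x = np.linspace(-1.0, 1.0, 5)
print(phi(x))                     # edge output at 5 sample inputs
```

Training adjusts `coeffs` (and, in pykan, a residual base activation) by gradient descent, which is what makes the network's activations themselves learnable.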
The whole notebook will cover the following sections:
| # | Topic |
|---|---|
| 01 | Data Generation, Visualization and Processing |
| 02 | ML Model Development & Learning |
| 03 | Deriving the Model's Governing Equation |
| 04 | Conclusion |
Required Libraries: We will use the following libraries throughout this project:
- NumPy
- SciPy
- PySR (Python Symbolic Regression)
- SymPy (symbolic mathematics)
- PyTorch (including its optimizers)
- Scikit-Learn (train/test split)
- KAN (pykan)
- Matplotlib
- Plotly
- Warnings
Let us import the required libraries:
from models import *
from kan import *
from data_generation import *
from utils import *
from training import *
import torch.nn.init as init
import matplotlib.pyplot as plt
import plotly.graph_objects as go
from mpl_toolkits.mplot3d import Axes3D
# ignore user warnings
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
#Set default data type
torch.set_default_dtype(torch.float64)
%matplotlib inline
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
cuda
Data Generation, Visualization and Processing
Linear System:
We considered the following 3rd-order linear system for system identification:
$$ \dot{x} = -0.1 x - 2 y $$ $$ \dot{y} = 2 x - 0.1 y $$
$$ \dot{z} = -0.3 z $$
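For this system, identification amounts to recovering the state matrix $A$ in $\dot{\mathbf{x}} = A\mathbf{x}$. As a point of reference, independent of the KAN/NN pipeline below, a minimal NumPy sketch recovers $A$ from noise-free (state, derivative) pairs by least squares:

```python
import numpy as np

# Ground-truth state matrix of the 3rd-order linear system above.
A_true = np.array([[-0.1, -2.0,  0.0],
                   [ 2.0, -0.1,  0.0],
                   [ 0.0,  0.0, -0.3]])

# Sample states and compute their exact derivatives x_dot = A x.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
Xdot = X @ A_true.T

# Identification: solve min_A ||Xdot - X A^T||^2 by least squares.
A_hat = np.linalg.lstsq(X, Xdot, rcond=None)[0].T
print(np.round(A_hat, 3))
```

With noise-free data this recovers $A$ essentially exactly; the learned models below have to do the same job from trajectory data, without assuming linearity.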
Nonlinear System:
We used the predator-prey model for nonlinear system identification in this project:
$$ \text{Prey: } \quad \dot{x} = 1.5 x - x y $$ $$ \text{Predator: } \quad \dot{y} = xy - 3 y $$
Data Generation
First, we defined the differential equations as functions so that we could solve them using SciPy Integrate and obtain the state values for each system. Then, we defined a lambda function to calculate the derivatives of the states. Finally, we defined a function called create_dataset to generate the dataset for learning purposes.
The create_dataset function takes the following inputs:
- System
- Lambda Function Associated with the system
- Initial State
- Number of samples
- Time Span
- function's output mode
- Device
- Seed
After generating the data, we will split the dataset into train and test data. We will use utils.py and data_generation.py to generate the dataset for both linear and nonlinear system.
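The project's data_generation.py is not reproduced in the notebook; the following is a minimal, hypothetical sketch of what create_dataset does for the predator-prey system, using scipy.integrate.solve_ivp and a simplified sequential 80/20 split (the notebook itself uses Scikit-Learn's shuffled split):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Predator-prey dynamics with the same coefficients as above.
def lotka_volterra(t, s):
    x, y = s
    return [1.5 * x - x * y, x * y - 3.0 * y]

# Integrate one trajectory and sample it at 1000 time points.
t_eval = np.linspace(0, 10, 1000)
sol = solve_ivp(lotka_volterra, (0, 10), [1.0, 1.0], t_eval=t_eval,
                rtol=1e-9, atol=1e-9)

states = sol.y.T                                           # inputs: (1000, 2) states
derivs = np.array([lotka_volterra(0, s) for s in states])  # labels: x_dot, y_dot

# Sequential 80/20 train/test split (simplified stand-in for train_test_split).
n_train = int(0.8 * len(states))
train_input, test_input = states[:n_train], states[n_train:]
train_label, test_label = derivs[:n_train], derivs[n_train:]
print(train_input.shape, test_input.shape)   # (800, 2) (200, 2)
```

The inputs are the states themselves and the labels are the state derivatives, so the models learn the vector field of the system rather than a time-series forecast.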
Data Visualization
Linear System
Dataset Description
print(f"************************\n",
      f"\nLinear Systems Dataset\n",
      f"\n************************\n",
      f"\nTotal Sample Size: {dataset_linear['train_input'].shape[0]}",
      f"\nTrain Sample Size: {data_linear['train_input'].shape[0]}",
      f"\nTest Sample Size: {data_linear['test_input'].shape[0]}",
      f"\nInput Features: {data_linear['train_input'].shape[1]}",
      f"\nOutput Features: {data_linear['test_label'].shape[1]}")
************************

Linear Systems Dataset

************************

Total Sample Size: 5000
Train Sample Size: 4000
Test Sample Size: 1000
Input Features: 3
Output Features: 3
Let us see how the states and the derivatives of the states change with respect to time.
# Create subplots
fig, axs = plt.subplots(1, 2, figsize=(12, 6))
colors = ['r', 'g', 'b', 'm']
# Define plot configurations
plot_configs = [(0, 'Train Data', 'State'),
(1, 'Train Labels', 'Output')
]
# Plot input and label for each state dimension
for i in range(dataset_linear['train_input'].shape[1]):
    axs[0].plot(t_linear,
                dataset_linear['train_input'][:, i].cpu().numpy(),
                color=colors[i % len(colors)],
                label=f'State {i+1}')
    axs[1].plot(t_linear,
                dataset_linear['train_label'][:, i].cpu().numpy(),
                color=colors[i % len(colors)],
                label=f'Output {i+1}')
# Plot each subplot using the config
for sub_plot, data_class, data_labels in plot_configs:
    axs[sub_plot].set_xlabel('Time Steps', fontsize=10)
    axs[sub_plot].set_ylabel(f'Evolution of {data_labels} for {data_class}', fontsize=10)
    axs[sub_plot].grid(True)
    axs[sub_plot].legend()
plt.tight_layout()
plt.show()
# Extract data as numpy arrays
x_input = dataset_linear['train_input'][:, 0].cpu().numpy()
y_input = dataset_linear['train_input'][:, 1].cpu().numpy()
z_input = dataset_linear['train_input'][:, 2].cpu().numpy()
x_label = dataset_linear['train_label'][:, 0].cpu().numpy()
y_label = dataset_linear['train_label'][:, 1].cpu().numpy()
z_label = dataset_linear['train_label'][:, 2].cpu().numpy()
# Create traces
trace_input = go.Scatter3d(x=x_input, y=y_input, z=z_input, mode='lines',
line=dict(color='purple', width=3), name='Input States')
trace_label = go.Scatter3d(x=x_label, y=y_label, z=z_label, mode='lines',
line=dict(color='orange', width=3), name='Rate of Change of States')
# Layout
layout = go.Layout(title="3D Phase Space of the Linear System",
scene=dict(xaxis=dict(title='x'),
yaxis=dict(title='y'),
zaxis=dict(title='z')),
legend=dict(x=0.02, y=0.98), margin=dict(l=0, r=0, b=0, t=50))
# Figure
fig = go.Figure(data=[trace_input, trace_label], layout=layout)
# Show interactive plot
fig.show()
Nonlinear System
Dataset Description
print(f"************************\n",
      f"\nNonlinear Systems Dataset\n",
      f"\n************************\n",
      f"\nTotal Sample Size: {dataset_nonlinear['train_input'].shape[0]}",
      f"\nTrain Sample Size: {data_nonlinear['train_input'].shape[0]}",
      f"\nTest Sample Size: {data_nonlinear['test_input'].shape[0]}",
      f"\nInput Features: {data_nonlinear['train_input'].shape[1]}",
      f"\nOutput Features: {data_nonlinear['test_label'].shape[1]}")
************************

Nonlinear Systems Dataset

************************

Total Sample Size: 1000
Train Sample Size: 800
Test Sample Size: 200
Input Features: 2
Output Features: 2
Plotting Nonlinear Dataset
# Create subplots
fig, axs = plt.subplots(2, 2, figsize=(12, 8))
colors = ['r', 'g', 'b', 'm']
# Define plot configurations
plot_configs = [(0, 0, 'State x'),
(0, 1, 'State y'),
(1, 0, 'rate of change of x'),
(1, 1, 'rate of change of y'),
]
axs[0, 0].plot(t_nonlinear, dataset_nonlinear['train_input'][:, 0].cpu().numpy(), color = colors[0], label = 'x')
axs[0, 1].plot(t_nonlinear, dataset_nonlinear['train_input'][:, 1].cpu().numpy(), color = colors[1], label = 'y')
axs[1, 0].plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), color = colors[2], label = r'$\dot{x}$')
axs[1, 1].plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), color = colors[3], label = r'$\dot{y}$')
# Plot each subplot using the config
for row, col, data_labels in plot_configs:
    axs[row, col].set_xlabel('Time Steps', fontsize=10)
    axs[row, col].set_ylabel(f'Evolution of {data_labels}', fontsize=10)
    axs[row, col].grid(True)
    axs[row, col].legend()
plt.tight_layout()
plt.show()
# Extract data
x_input_nl = dataset_nonlinear['train_input'][:, 0].cpu().numpy()
y_input_nl = dataset_nonlinear['train_input'][:, 1].cpu().numpy()
z_input_nl = np.zeros_like(x_input_nl)
x_label_nl = dataset_nonlinear['train_label'][:, 0].cpu().numpy()
y_label_nl = dataset_nonlinear['train_label'][:, 1].cpu().numpy()
z_label_nl = np.zeros_like(x_label_nl)
# Create the trace
trace_input_nl = go.Scatter3d(x=x_input_nl, y=y_input_nl, z=z_input_nl, mode='lines',
line=dict(color='purple', width=3), name='States')
# Create the trace
trace_label_nl = go.Scatter3d(x=x_label_nl, y=y_label_nl, z=z_label_nl, mode='lines',
line=dict(color='green', width=3), name='Rate of Change of States')
# Layout
layout = go.Layout(title="3D Phase Space of the Nonlinear System",
scene=dict(xaxis=dict(title='x'), yaxis=dict(title='y'), zaxis=dict(title='z')),
margin=dict(l=0, r=0, b=0, t=50), legend=dict(x=0.02, y=0.98))
# Figure
fig = go.Figure(data=[trace_input_nl, trace_label_nl], layout=layout)
# Show interactive plot
fig.show()
ML Model Development & Learning
We defined the Neural Network models in the models.py file. The file includes the following models:
- Neural Network Model for Linear System
- Neural Network Model for Nonlinear System
KAN models are defined in this chapter.
The training.py file includes a trainer object that we will use to train our models. The following table summarizes the activation functions, optimizers, loss functions, metrics and learning-rate schedulers for all models.
| No | Parameters | Kolmogorov Arnold Networks | Artificial Neural Network |
|---|---|---|---|
| 01 | Activation Function | Sigmoid Linear Unit (SiLU) for both systems | ReLU for both systems |
| 02 | Optimizer | Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) for both systems | AdamW for both systems |
| 03 | Loss Function | Root Mean Squared Error for both systems | Mean Squared Error for both systems |
| 04 | Metric | None | MAE for both systems |
| 05 | LR Scheduler | None | ReduceLROnPlateau for both systems |
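For reference, the three error measures in the table differ only in how they aggregate the same residuals; a quick NumPy illustration:

```python
import numpy as np

# MSE is the mean squared residual, RMSE its square root (so it has the
# same units as the target), and MAE the mean absolute residual.
def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    return np.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

This is why the KAN losses reported below (RMSE) and the NN losses (MSE) are not directly comparable in magnitude.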
Defining KAN Model for Linear System
The linear KAN model consists of two layers: an input layer and an output layer. The input layer takes in 3 features, and the output layer produces 3 features. The degree of the B-spline polynomial is 3, and the number of grid intervals on the B-spline is also set to 3.
model_linear = KAN(width=[3,3], grid=3, k=3, seed=1, device=device, auto_save = True)
checkpoint directory created: ./model
saving model version 0.0
We can plot the model’s initial diagram after passing the training data through it one time.
model_linear(data_linear['train_input'])
model_linear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
Training the Model:
We will train the KAN model for 100 steps.
steps = 100
history_linear = model_linear.fit(data_linear, opt="LBFGS", steps=steps, lamb=0.01)
| train_loss: 7.49e-02 | test_loss: 7.65e-02 | reg: 4.59e+00 | : 100%|█| 100/100 [00:41<00:00, 2.41
saving model version 0.1
Plot of Loss & Regularization Curve
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(history_linear['train_loss'], label = 'Train Loss')
plt.plot(history_linear['test_loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train Vs Test Loss')
plt.grid()
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(history_linear['reg'], label = 'Regularization')
plt.xlabel('Epoch')
plt.ylabel('Regularization')
plt.title('Change in Regularization')
plt.grid()
plt.legend()
<matplotlib.legend.Legend at 0x210a4587770>
Model Diagram after Training
model_linear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
Model's Prediction & Comparison with True Labels
preds_linear = model_linear(dataset_linear['train_input'])
fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))
ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()
ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
<matplotlib.legend.Legend at 0x210a222ffb0>
From the loss curve and the model plot, we see that the model does not perform well with a polynomial degree of 3 and a grid of 3. Let us initialize a more fine-grained KAN with 10 grid points and train the model again.
model_linear = model_linear.refine(10)
result = model_linear.fit(data_linear, opt="LBFGS", steps=100, lamb=0.01)
saving model version 0.2
| train_loss: 6.36e-02 | test_loss: 6.60e-02 | reg: 4.57e+00 | : 100%|█| 100/100 [00:36<00:00, 2.75
saving model version 0.3
Since the loss has decreased, we can now refine the grids iteratively.
grids = np.array([3,10,20,50,100])
train_losses_linear = []
test_losses_linear = []
steps = 200
k = 3
for i in range(grids.shape[0]):
    if i == 0:
        model_linear = KAN(width=[3,3], grid=grids[i], k=k, seed=1, device=device)
    if i != 0:
        model_linear = model_linear.refine(grids[i])
    results = model_linear.fit(data_linear, opt="LBFGS", steps=steps)
    train_losses_linear += results['train_loss']
    test_losses_linear += results['test_loss']
checkpoint directory created: ./model
saving model version 0.0
| train_loss: 2.13e-04 | test_loss: 2.19e-04 | reg: 5.98e+00 | : 100%|█| 200/200 [01:08<00:00, 2.93
saving model version 0.1
saving model version 0.2
| train_loss: 1.41e-05 | test_loss: 1.56e-05 | reg: 5.96e+00 | : 100%|█| 200/200 [01:04<00:00, 3.10
saving model version 0.3
saving model version 0.4
| train_loss: 1.82e-06 | test_loss: 1.97e-06 | reg: 5.95e+00 | : 100%|█| 200/200 [01:16<00:00, 2.63
saving model version 0.5
saving model version 0.6
| train_loss: 3.47e-07 | test_loss: 3.66e-07 | reg: 5.95e+00 | : 100%|█| 200/200 [01:14<00:00, 2.70
saving model version 0.7
saving model version 0.8
| train_loss: 9.50e-08 | test_loss: 1.12e-07 | reg: 5.95e+00 | : 100%|█| 200/200 [01:28<00:00, 2.27
saving model version 0.9
RMSE Vs Steps in Grid Iteration:
Now, let us look at the training dynamics of the loss. It turns out that the training dynamics exhibit a staircase pattern, where the loss suddenly drops after each grid refinement.
plt.plot(train_losses_linear)
plt.plot(test_losses_linear)
plt.legend(['train', 'test'])
plt.ylabel('RMSE')
plt.title("Linear System: Training Dynamics of Loss")
plt.xlabel('step')
plt.yscale('log')
RMSE Vs Number of Parameters in KAN
n_params_linear = 4 * (grids + 3)
train_vs_G_linear = train_losses_linear[(steps-1)::steps]
test_vs_G_linear = test_losses_linear[(steps-1)::steps]
plt.plot(n_params_linear, train_vs_G_linear, marker="o")
plt.plot(n_params_linear, test_vs_G_linear, marker="o")
plt.plot(n_params_linear, 100*n_params_linear**(-4.), ls="--", color="black")
plt.xscale('log')
plt.yscale('log')
plt.legend(['train', 'test', r'$N^{-4}$'])
plt.xlabel('number of params')
plt.ylabel('RMSE')
plt.title("Linear Systems: Model's Parameter Vs RMSE")
Text(0.5, 1.0, "Linear Systems: Model's Parameter Vs RMSE")
Model's Final Diagram:
The model looks like the following after grid refinement:
model_linear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
Final Model's Prediction and Comparison with True Labels
preds_linear = model_linear(dataset_linear['train_input'])
fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))
ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()
ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
<matplotlib.legend.Legend at 0x210b3204bc0>
We will follow the same steps for learning nonlinear dynamics.
Learning Nonlinear Systems with KAN
The nonlinear KAN model has 3 layers: input, hidden, and output. It takes in 2 features and produces 2 features at the output layer. The hidden layer has 4 nodes. The degree of the B-spline polynomial is set to 3, and the number of grid intervals is 3 for this model.
Defining Model
model_nonlinear = KAN(width=[2, 4, 2], grid=3, k=3, seed=1, device=device, auto_save = True)
checkpoint directory created: ./model
saving model version 0.0
Initial Model Plot
model_nonlinear(data_nonlinear['train_input'])
model_nonlinear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
Training the Model
history_nonlinear = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=100, lamb=0.01)
| train_loss: 4.99e-02 | test_loss: 5.04e-02 | reg: 1.72e+01 | : 100%|█| 100/100 [01:00<00:00, 1.65
saving model version 0.1
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(history_nonlinear['train_loss'], label = 'Train Loss')
plt.plot(history_nonlinear['test_loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train Vs Test Loss')
plt.grid()
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(history_nonlinear['reg'], label = 'Regularization')
plt.xlabel('Epoch')
plt.ylabel('Regularization')
plt.title('Change in Regularization')
plt.grid()
plt.legend()
<matplotlib.legend.Legend at 0x2109d1da8d0>
Model Diagram after Training
model_nonlinear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
Let's initialize a more fine-grained KAN with G = 10 to see whether the loss goes down or not.
model_nonlinear = model_nonlinear.refine(10)
result = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=100, lamb=0.01)
saving model version 0.2
| train_loss: 1.84e-02 | test_loss: 1.87e-02 | reg: 1.69e+01 | : 100%|█| 100/100 [01:08<00:00, 1.46
saving model version 0.3
As the loss goes down, let us train more fine-grained KAN iteratively.
grids = np.array([3,10,20,50,100])
train_losses_nonlinear = []
test_losses_nonlinear = []
steps = 250
k = 3
for i in range(grids.shape[0]):
    if i == 0:
        model_nonlinear = KAN(width=[2,4,2], grid=grids[i], k=k, seed=1, device=device)
    if i != 0:
        model_nonlinear = model_nonlinear.refine(grids[i])
    results = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=steps)
    train_losses_nonlinear += results['train_loss']
    test_losses_nonlinear += results['test_loss']
checkpoint directory created: ./model
saving model version 0.0
| train_loss: 1.63e-03 | test_loss: 1.66e-03 | reg: 2.50e+01 | : 100%|█| 250/250 [01:54<00:00, 2.19
saving model version 0.1
saving model version 0.2
| train_loss: 9.99e-05 | test_loss: 9.43e-05 | reg: 2.50e+01 | : 100%|█| 250/250 [01:54<00:00, 2.18
saving model version 0.3
saving model version 0.4
| train_loss: 1.18e-05 | test_loss: 1.38e-05 | reg: 2.50e+01 | : 100%|█| 250/250 [01:55<00:00, 2.17
saving model version 0.5
saving model version 0.6
| train_loss: 2.29e-06 | test_loss: 2.87e-06 | reg: 2.50e+01 | : 100%|█| 250/250 [01:49<00:00, 2.28
saving model version 0.7
saving model version 0.8
| train_loss: 1.25e-06 | test_loss: 3.43e-06 | reg: 2.50e+01 | : 100%|█| 250/250 [02:04<00:00, 2.01
saving model version 0.9
RMSE Vs Steps in Grid Iteration
plt.plot(train_losses_nonlinear)
plt.plot(test_losses_nonlinear)
plt.legend(['train', 'test'])
plt.ylabel('RMSE')
plt.title("Nonlinear System: Training Dynamics of Loss")
plt.xlabel('step')
plt.yscale('log')
RMSE Vs Number of Parameters in KAN
n_params_nonlinear = 16 * (grids + 3)
train_vs_G_nonlinear = train_losses_nonlinear[(steps-1)::steps]
test_vs_G_nonlinear = test_losses_nonlinear[(steps-1)::steps]
plt.plot(n_params_nonlinear, train_vs_G_nonlinear, marker="o")
plt.plot(n_params_nonlinear, test_vs_G_nonlinear, marker="o")
plt.plot(n_params_nonlinear, 100*n_params_nonlinear**(-4.), ls="--", color="black")
plt.xscale('log')
plt.yscale('log')
plt.legend(['train', 'test', r'$N^{-4}$'])
plt.xlabel('number of params')
plt.ylabel('RMSE')
plt.title("Nonlinear Systems: Model's Parameter Vs RMSE")
Text(0.5, 1.0, "Nonlinear Systems: Model's Parameter Vs RMSE")
Model's Final Diagram
model_nonlinear.plot(beta = 10,
in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
Final Model's Prediction and Comparison with True Labels
preds_nonlinear = model_nonlinear(dataset_nonlinear['train_input'])
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharex=True, figsize=(12, 4))
ax1.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_nonlinear, preds_nonlinear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()
ax2.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_nonlinear, preds_nonlinear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
<matplotlib.legend.Legend at 0x2112993a390>
Learning Linear Systems with Neural Network
NN_linear = NNM_liniear(input_size = 3, output_size = 3, hidden_size = 32, num_layers = 2)
NN_linear.to(device)
NNM_liniear( (fc): Linear(in_features=3, out_features=3, bias=True) )
Metrics = metrics()
optimizer_linear = optim.AdamW(NN_linear.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()
# Define ReduceLROnPlateau scheduler
scheduler_linear = ReduceLROnPlateau(optimizer_linear,
mode='min',
factor=0.7,
patience=5)
NN_Trainer_linear = trainer(model = NN_linear,
trainloader = trainloader_linear,
valloader = valloader_linear,
num_epochs = 280,
optimizer = optimizer_linear,
loss_func = criterion,
metrics = Metrics,
device = device,
scheduler = scheduler_linear
)
NNM_history_linear = NN_Trainer_linear.fit()
100%|██████████| 280/280 [01:13<00:00, 3.80it/s, Train_MAE=6.06e-7, Train_loss=4.67e-16, Val_MAE=1.17e-6, Val_loss=4.41e-16]
Train Loss 0.0000, Val Loss: 0.0000, Train MAE: 0.0000, Val MAE: 0.0000, Train Adjusted R²: 1.0000, Val Adjusted R²: 1.0000
# Plot training and validation loss and accuracy
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(NNM_history_linear['Average Train Loss'], label = 'Train Loss')
plt.plot(NNM_history_linear['Average Val Loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Vs Validation Loss')
plt.grid()
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(NNM_history_linear['Average Train MAE'], label = 'Average Train MAE')
plt.plot(NNM_history_linear['Average Val MAE'], label = 'Average Validation MAE')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.title('Training Vs Validation MAE Comparison')
plt.grid()
plt.legend()
<matplotlib.legend.Legend at 0x21105a228d0>
Plotting the Neural Network's Prediction for Linear System
NNM_preds_linear = NN_linear(dataset_linear['train_input'])
fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))
ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, NNM_preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()
ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, NNM_preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, NNM_preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
<matplotlib.legend.Legend at 0x2110a7af9e0>
Learning Nonlinear Systems with Neural Network
NN_nonlinear = NNM_nonlinear(input_size=2, output_size=2, hidden_size=16, seed=1)
NN_nonlinear.to(device)
NNM_nonlinear( (fc1): Linear(in_features=2, out_features=16, bias=True) (fc2): Linear(in_features=16, out_features=2, bias=True) )
optimizer_nonlinear = optim.AdamW(NN_nonlinear.parameters(), lr=1e-3)
# Define ReduceLROnPlateau scheduler
scheduler_nonlinear = ReduceLROnPlateau(optimizer_nonlinear,
mode='min',
factor=0.7,
patience=4)
NN_Trainer_nonlinear = trainer(model = NN_nonlinear,
trainloader = trainloader_nonlinear,
valloader = valloader_nonlinear,
num_epochs = 500,
optimizer = optimizer_nonlinear,
loss_func = criterion,
metrics = Metrics,
device = device,
scheduler = scheduler_nonlinear
)
NNM_history_nonlinear = NN_Trainer_nonlinear.fit()
100%|██████████| 500/500 [00:32<00:00, 15.34it/s, Train_MAE=0.149, Train_loss=4.15, Val_MAE=0.148, Val_loss=3.83]
Train Loss 4.1487, Val Loss: 3.8266, Train MAE: 0.1495, Val MAE: 0.1478, Train Adjusted R²: 0.7373, Val Adjusted R²: 0.7434
NNM_history_linear.keys()
dict_keys(['Average Train Loss', 'Average Train MAE', 'Average Train R2', 'Average Train Adjusted R2', 'Average Val Loss', 'Average Val MAE', 'Average Val R2', 'Average Val Adjusted R2'])
# Plot training and validation loss and accuracy
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(NNM_history_nonlinear['Average Train Loss'], label = 'Train Loss')
plt.plot(NNM_history_nonlinear['Average Val Loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Vs Validation Loss')
plt.grid()
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(NNM_history_nonlinear['Average Train Adjusted R2'], label = 'Average Train Adjusted R Squared')
plt.plot(NNM_history_nonlinear['Average Val Adjusted R2'], label = 'Average Validation Adjusted R Squared')
plt.xlabel('Epoch')
plt.ylabel('Adjusted R²')
plt.title('Training Vs Validation Adjusted R Squared Comparison')
plt.grid()
plt.legend()
<matplotlib.legend.Legend at 0x211058ea8d0>
Plotting the Neural Network's Prediction for Nonlinear System
NNM_preds_nonlinear = NN_nonlinear(dataset_nonlinear['train_input'])
NNM_preds_nonlinear.shape, t_nonlinear.shape
(torch.Size([1000, 2]), (1000,))
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharex=True, figsize=(12, 4))
ax1.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_nonlinear, NNM_preds_nonlinear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()
ax2.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_nonlinear, NNM_preds_nonlinear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
<matplotlib.legend.Legend at 0x2109d1e5f10>
Deriving the Model's Governing Equation
Now, the Symbolic KAN Layer provided by the KAN library supports only univariate functions. However, the predator-prey model includes a nonlinear term (xy). As a result, the default Symbolic KAN Layer is unable to produce the differential equations of the predator-prey system. To address this limitation, we will use Python Symbolic Regression (PySR) to discover the governing equations represented by the model.
To derive the governing equations using PySR, we need to follow these steps:
- Define Python Symbolic Mappings
- Define Symbolic Regressor
- Define Inputs and Outputs of the Symbolic Regressor and
- Fit the Regressor to produce model's governing equations.
Our Python symbolic mappings and Symbolic Regressor are defined in the models.py file.
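The actual PySR configuration is not reproduced in the notebook. To illustrate the idea of recovering governing equations by regression over candidate terms, here is a simplified SINDy-style sketch in NumPy — linear least squares over a fixed term library, rather than PySR's evolutionary search over expression trees:

```python
import numpy as np

# Not the actual PySR pipeline: a SINDy-style sketch that recovers the
# predator-prey equations by regression over a candidate term library.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 4.0, size=(400, 2))                    # sampled (x, y) states
x, y = X[:, 0], X[:, 1]
Xdot = np.column_stack([1.5 * x - x * y, x * y - 3.0 * y])  # true derivatives

# Candidate library of terms: [1, x, y, x*y, x^2, y^2]
Theta = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
coefs = np.linalg.lstsq(Theta, Xdot, rcond=None)[0]
print(np.round(coefs.T, 3))   # row 0 -> x_dot coefficients, row 1 -> y_dot
```

With noise-free data the regression picks out exactly the terms 1.5x, -xy, xy and -3y; PySR solves the harder version of this problem where the library of candidate expressions is not fixed in advance.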
Governing Equations of KAN Model for Linear System
X_linear = dataset_linear['train_input']
KAN_equations_Linear = Equation_Generator(model_linear, dataset_linear['train_input'], variable_names=["x", "y", "z"])
🔍 Fitting Model for it's Governing Equtions... ✅ Completed Symbolic Regression for the Given Model.
simplify(KAN_equations_Linear[0].equation)
simplify(KAN_equations_Linear[1].equation)
simplify(KAN_equations_Linear[2].equation)
Governing Equations of KAN Model for Nonlinear System
KAN_equations_nonlinear = Equation_Generator(model_nonlinear, dataset_nonlinear['train_input'], variable_names=["x", "y"])
🔍 Fitting Model for it's Governing Equtions... ✅ Completed Symbolic Regression for the Given Model.
simplify(KAN_equations_nonlinear[0].equation)
simplify(KAN_equations_nonlinear[1].equation)
Governing Equations of Neural Network Model for Linear System
NN_equations_linear = Equation_Generator(NN_linear, dataset_linear['train_input'], variable_names=["x", "y", "z"])
🔍 Fitting Model for it's Governing Equtions... ✅ Completed Symbolic Regression for the Given Model.
simplify(NN_equations_linear[0].equation)
simplify(NN_equations_linear[1].equation)
simplify(NN_equations_linear[2].equation)
Governing Equations of Neural Network Model for Nonlinear System
NN_equations_nonlinear = Equation_Generator(NN_nonlinear, dataset_nonlinear['train_input'], variable_names=["x", "y"])
🔍 Fitting Model for it's Governing Equtions... ✅ Completed Symbolic Regression for the Given Model.
simplify(NN_equations_nonlinear[0].equation)
simplify(NN_equations_nonlinear[1].equation)
Conclusion
Using Kolmogorov–Arnold Networks, we can approximate the differential equations of both linear and nonlinear systems. In contrast, Neural Networks can only approximate the linear system; they fail to accurately approximate the nonlinear system. Below is a comparison table for system identification:
| Original System Dynamics | Derived Using KAN | Derived Using Neural Network |
|---|---|---|
| $\dot{x} = -0.1 x - 2 y\\\dot{y} = 2 x - 0.1 y\\ \dot{z} = -0.3 z$ | $\dot{x} = -0.1x − 2.0y\\ \dot{y} = 2.0x − 0.1y\\ \dot{z} = −0.3z$ | $\dot{x} = −0.1x −2y\, \text{ (Approx) }\\ \dot{y} = 2.0x − 0.1y\, \text{ (Approx) }\\ \dot{z} = −0.3z\, \text{ (Approx) }$ |
| $\dot{x} = 1.5 x - x y\\ \dot{y} = xy - 3.0y $ | $\dot{x} = 1.5x - xy\\ \dot{y} = xy - 3.0y$ | $\dot{x} = 0.0216x − 2.9617y + 4.3669\, \text{ (Approx) }\\ \dot{y} = 1.4819x - 0.0303y - 4.3944\, \text{ (Approx) }$ |
In our experiments, the Kolmogorov-Arnold Network was more interpretable than the standard neural network and more effective at system identification. With more computational power, it should be able to identify higher-order dynamical systems as well.